fix: Prevent O(n²) memory usage in command output #9693
Conversation
Thanks for your contribution! This PR doesn't have a linked issue. All PRs must reference an existing issue. Please see CONTRIBUTING.md for details.
The following comment was made by an LLM; it may be inaccurate: Based on my search, I found one potentially related PR that addresses similar memory concerns. Related PR:
The other search results (like #7049 about output pruning) are related to memory management but focus on different aspects.
The bash tool and the inline command execution were accumulating command output using `output += chunk.toString()`, which creates a new string for every chunk and copies all previous content. For commands producing large output (like meson in weston), this caused catastrophic O(n²) memory usage, as the GC had no time to catch up. Large output was eventually written to disk and truncated in memory, but that only happened at the end, potentially leading to running out of memory.

Fix by streaming output to disk when it exceeds a 50KB threshold:

- Output < 50KB: stays in memory (no file created)
- Output ≥ 50KB: streams to file, 50KB sliding window in memory

Also adds line count tracking to match the original `Truncate.MAX_LINES` behavior and proper cleanup on error/abort. The InteractiveCommand in session/prompt.ts uses the same pattern.
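The threshold behavior described above can be sketched roughly as follows. This is an illustrative accumulator, not the PR's actual code: the class and member names are invented, and synchronous `fs` writes are used for brevity where a real implementation would presumably use streams (`slice` here also counts characters rather than bytes, which is only equivalent for ASCII output).

```typescript
import * as fs from "node:fs"
import * as os from "node:os"
import * as path from "node:path"

const MEMORY_LIMIT = 50 * 1024 // 50KB threshold from the PR description

// Hypothetical sketch: output below the threshold stays in memory and no
// file is created; once output crosses the threshold, everything is spilled
// to a file and only a 50KB sliding window is kept in memory.
class OutputAccumulator {
  private buffer = ""
  private fd?: number
  readonly filePath: string
  lines = 0

  constructor(dir: string) {
    this.filePath = path.join(dir, "output.txt")
  }

  push(chunk: Buffer | string): void {
    const text = chunk.toString()
    this.lines += (text.match(/\n/g) ?? []).length
    if (this.fd === undefined && this.buffer.length + text.length > MEMORY_LIMIT) {
      // Crossed the threshold: open the spill file and flush what we have.
      this.fd = fs.openSync(this.filePath, "w")
      fs.writeSync(this.fd, this.buffer)
    }
    if (this.fd !== undefined) fs.writeSync(this.fd, text)
    // Keep at most MEMORY_LIMIT characters in memory (sliding window).
    this.buffer = (this.buffer + text).slice(-MEMORY_LIMIT)
  }

  get tail(): string {
    return this.buffer
  }

  get spilled(): boolean {
    return this.fd !== undefined
  }

  close(): void {
    if (this.fd !== undefined) fs.closeSync(this.fd)
  }
}
```

With this shape, a small command's output never touches disk, while a `find /`-style flood is bounded to roughly 50KB of process memory plus the file on disk, which matches the before/after behavior the description claims.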
```ts
import { Truncate } from "./truncation"
import { Log } from "../util/log"

const MAX_METADATA_LENGTH = 30_000
```
keep that at 6 (now 9) just to reduce the diff?
Could you review this, please, @rekram1-node? I have reviewed it myself and I think the code looks good.
Hi @cgwalters, your change looks even better with the addition of the filter, I like that a lot. I think the only thing it's missing compared to mine is that I am also fixing the memory usage blowing up on the interactive command side. I'm happy to do whatever here: keep just that side in my patch and merge yours. You are a lot more knowledgeable than I am, so I'm happy to yield; as long as we get a fix and I don't have to carry a patch to avoid OOM on my system, I am happy ;D What's your take, @rekram1-node?
Does this actually fix the memory issues for you guys? I guess if you're running noisy commands it can be problematic, but most of the memory issues I've seen aren't related to bash tool invocations.
Also, for truncation it's fine to truncate the metadata stuff, but the actual output should keep the same truncation logic that exists right now; we shouldn't need to reimplement it, right? For the LLM at least, but let me read over this more.
Yes it does. Ask opencode to run `find /` and watch the process memory usage. I have seen opencode use 70% of my test system's memory while running nothing but a `find /` through either the bash tool or the interactive command (`!find /`). With this patch it never goes beyond 3%.
The actual output gets saved in full to a file, just like it did before the fix. The full path to that file is appended as a hint on the version of the metadata the agent sees, also the same behavior we had before the fix. I have tested it: the agent can see the full output no problem with this fix; it can tell me the first and last lines of the output, for instance. Is that what you mean, or are you talking about something else?
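The "path appended as a hint" behavior described above might look something like this. The function name and hint wording here are invented for illustration; the PR's actual hint format is not quoted in this thread.

```typescript
// Hypothetical sketch: the agent sees a truncated window of the output
// plus a pointer to the on-disk file holding the full output.
function withOverflowHint(visible: string, fullPath: string, truncated: boolean): string {
  if (!truncated) return visible
  return `${visible}\n\n[output truncated; full output saved to ${fullPath}]`
}
```

This is why the agent can still report the first and last lines of a huge output: it reads the file at the hinted path instead of relying on the in-memory window.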
What does this PR do?
Avoids a pathological GC case caused by repeated string concatenation that was leading to the OOM killer kicking in on my 16GB RAM laptop when opencode ran a large meson build with verbose output. I was hitting the issue with the bash tool, but the problem also happens on the interactive command codepath, so I changed that as well.
The fix is to stop accumulating the whole output as one big string through repeated concatenation, with `Truncate.output()` handling it after the tool has finished, and instead do our own truncation handling. We only accumulate up to 50K in memory (this could be improved to be a sliding window of 50K if we prefer), and anything beyond that streams to a file. The final outcome is the same as before, but without the explosive memory usage.
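The cost difference being fixed can be illustrated in isolation. This is not code from the PR, just a sketch of the two accumulation patterns: `+=` conceptually copies all previously accumulated content on every append, so n chunks cost O(n²) bytes of copying and allocation, while collecting chunks and joining once is linear.

```typescript
// The pattern the PR removes: each += allocates a new string that
// includes everything accumulated so far.
function accumulateQuadratic(chunks: string[]): string {
  let output = ""
  for (const chunk of chunks) {
    output += chunk // copies all prior content on every iteration
  }
  return output
}

// A linear alternative: keep the chunks and copy each byte once at the end.
function accumulateLinear(chunks: string[]): string {
  const parts: string[] = []
  for (const chunk of chunks) {
    parts.push(chunk) // O(1) amortized; no copying of prior chunks
  }
  return parts.join("") // one final O(n) copy
}
```

(Engines like V8 partially mitigate `+=` with rope-like internal representations, but the per-append allocations still generate heavy GC pressure, which matches the "GC had no time to catch up" behavior described in this PR.)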
How did you verify your code works?
I ran the tests, typechecked, and ran a build of opencode that includes the fix. opencode itself tried using the tool and succeeded, and was also able to look at the whole file afterwards. I did not observe the explosive memory usage I was seeing before, so I assume that is now fixed.
A good way to test this is to run `!find /` and click to expand the preview (be careful, as that may cause your machine to grind to a halt; you can use something else that produces long output instead). With this PR applied you should see the same text at the end, but you should not see the memory usage explode.